Abstract

A procedure for unfolding the true distribution from experimental data is 
presented. Machine learning methods are applied for simultaneous identifi- 
cation of an apparatus function and solving of an inverse problem. A priori 
information about the true distribution from theory or previous experiments 
is used for Monte-Carlo simulation of the training sample. The training 
sample can be used to calculate a transformation from the true distribution 
to the measured one. This transformation provides a robust solution for 
an unfolding problem with minimal biases and statistical errors for the set 
of distributions used to create the training sample. The dimensionality of 
the solved problem can be arbitrary. A numerical example is presented to 
illustrate and validate the procedure. 

1. Introduction

An experimentally measured distribution differs from the true physical 
distribution because of the limited efficiency of event registration and the 
finite resolution of a particular set-up. To identify a physical distribution, 
an unfolding procedure is typically applied [lH|. Unfolding is an underspecified
problem. Any approach to solving the problem requires a priori
information about the solution. Methods for unfolding differ, directly or
indirectly, in the use of this a priori information. 

Unfolding when the apparatus function, or the model for transforming the
true distribution into the measured one, is unknown has been considered
previously 0. In this paper these ideas are developed further, and the
problem of simultaneously identifying the transformation model and solving
the inverse problem is addressed. To obtain a robust solution for an unfolding problem,
information about the shape of the distribution to be measured is used to 
create a training sample in Monte-Carlo simulations of an experiment. An 
approximation of the apparatus function is calculated for the set of distributions
for the training sample. Use of this type of approximation can minimize
the statistical errors and biases of the unfolded distribution for distributions 
used to create the training sample. There is no restriction on the size and 
shape of bins, linearization of the problem is simple (if the set-up has non- 
linear distortions), and multidimensional data can be unfolded. A machine 
learning approach provides a method for validating the unfolding procedure 
and for improving the results. 

The remainder of the paper is organized as follows. In Section 2 the main 
equation for solving an unfolding problem is proposed. A formal method for 
solving the unfolding problem and estimating the statistical errors for the 
unfolded distribution is discussed. Section 3 presents the algorithm for cal- 
culating the transformation matrix. In Section 4 the overall unfolding proce- 
dure is described. This consists of bin choice, system identification, solution 
of the basic equation and validation of the unfolding procedure. Section 5 
presents a numerical example. For comparison, an example reported elsewhere
is used j2[ [sl, 0]. To investigate biases in the unfolded distribution,
a numerical experiment with 1000 runs is performed. The results show that
the biases for the unfolded distribution are small. To demonstrate the robustness
of the unfolding method for distributions used to create the training sample,
the same investigation is performed for eight distributions randomly chosen
from the training sample. The results reveal that there are small biases and low
statistical errors for all the unfolded distributions, which confirms that the
procedure is robust. Statistical errors are as small as possible in all cases 
because of application of the least mean square method and the method for 
system identification. 

2. Main equation

In this work we use a linear model to transform a true distribution to the 
measured one: 

$$f = P\phi + \varepsilon, \qquad (1)$$

where $f$ is an $m$-component column vector of an experimentally measured
histogram, $P$ is an $m \times n$ matrix with $m > n$, $\phi$ is an $n$-component
vector of some true histogram and $\varepsilon$ is an $m$-component vector of random
residuals with expectation value $E\varepsilon = 0$ and a diagonal variance matrix
$\Sigma = \mathrm{Var}\,\varepsilon = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_m^2)$, where $\sigma_i$ is the statistical error of the
measured distribution for the $i$th bin. The linear model is reasonable for
the majority of set-ups in particle and nuclear physics. It is only an approximate
model for set-ups with a non-linear transformation from a true to a
measured distribution.

A least squares method ^] can give an estimate of the true distribution $\phi$,

$$\hat\phi = (P'\Sigma^{-1}P)^{-1}P'\Sigma^{-1}f, \qquad (2)$$

where $\hat\phi$, the estimate, is the unfolded distribution. The variance matrix
$\Delta$ of the unfolded distribution is given by

$$\Delta = \mathrm{Var}\,\hat\phi = (P'\Sigma^{-1}P)^{-1}. \qquad (3)$$

The diagonal element $\delta_{ii}^2$ of the matrix $\Delta$ is the variance of component $\hat\phi_i$ of the
unfolded vector, and $\delta_{ii}$ is its statistical error.
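
As an illustration of Eqs. (2) and (3), the following minimal sketch computes the weighted least squares estimate and its variance matrix in plain Python. The function names (`matmul`, `inverse`, `unfold`) and the tiny 3 x 2 example are assumptions made for illustration only; they are not part of the original procedure, and a production analysis would normally rely on a numerical linear algebra library.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("P' Sigma^-1 P is singular; no least squares solution")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def unfold(P, f, sigma):
    """Return (phi_hat, variance matrix) for f = P phi + eps, Var eps = diag(sigma^2)."""
    # Rows of P' Sigma^-1: column k of P with entry j divided by sigma_j^2.
    Pt_W = [[p / s ** 2 for p, s in zip(col, sigma)] for col in zip(*P)]
    cov = inverse(matmul(Pt_W, P))                          # Eq. (3)
    rhs = [[sum(w * fj for w, fj in zip(row, f))] for row in Pt_W]
    phi_hat = [row[0] for row in matmul(cov, rhs)]          # Eq. (2)
    return phi_hat, cov


# Hypothetical toy set-up: m = 3 measured bins, n = 2 true bins, unit errors.
P = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
f = [2.0, 3.0, 5.0]
sigma = [1.0, 1.0, 1.0]
phi_hat, cov = unfold(P, f, sigma)
```

The square roots of the diagonal elements of `cov` play the role of the statistical errors of the unfolded components.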



3. Identification of the transformation model 

To realize the scheme described in Section 2, the matrix P must be defined.
This problem can be solved using system identification methods [IJ, [15].
System identification can be defined as a process for determining a model of
a dynamic system using observed input-output data. In our case, this is the
model for transforming a true physical distribution into the experimentally
measured distribution, represented by the matrix P. Monte-Carlo simulation
of a set-up can be used to obtain input-output data. Control input signals
are used for system identification. The most popular choice is to use impulse
control signals [IJ, [15].



An impulse input control signal is a generated (input) distribution in
which the histogram with $n$ bins has only one bin with non-zero content. For
model (1), there are $n$ different impulse inputs that can be presented as the
diagonal matrix $\Phi = \mathrm{diag}(\phi_{11}, \phi_{22}, \ldots, \phi_{nn})$, where each row contains the
content from a generated histogram. Denote the corresponding values of the
$i$th component of the reconstructed (output) vector as $f_i = (f_{i1}\, f_{i2} \cdots f_{in})'$.
Each element of the $i$th row of the matrix

$$P = \begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1n} \\
\vdots & \vdots & & \vdots \\
p_{i1} & p_{i2} & \cdots & p_{in} \\
\vdots & \vdots & & \vdots \\
p_{m1} & p_{m2} & \cdots & p_{mn}
\end{pmatrix}$$

can be found from the equation

$$f_i = \Phi p_i, \qquad (4)$$

where $p_i = (p_{i1}\, p_{i2} \cdots p_{in})'$, and $p_{ij} = f_{ij}/\phi_{jj}$. Equation (2), with the matrix
P calculated in this way, gives a highly fluctuating unfolded function with
large statistical errors. In addition, it is possible that the matrix $P'\Sigma^{-1}P$ is
singular, in which case a solution does not exist. The effect of this type of
instability is well known. There are many methods for solving this type of
system, all of which use a priori information to obtain a stable solution to
Eq. (1).
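
The impulse-based identification described above amounts to dividing each simulated response by the height of the impulse that produced it. A minimal sketch follows; the function name `identify_from_impulses` and the toy numbers are hypothetical and only illustrate the relation between matrix elements, responses and impulse heights.

```python
def identify_from_impulses(impulse_heights, responses):
    """Build the m x n matrix P from n impulse inputs.

    impulse_heights[j] is the content of the single non-zero bin j of impulse j.
    responses[j][i] is the content of reconstructed bin i for impulse j.
    Each matrix element is the response divided by the impulse height.
    """
    n = len(impulse_heights)
    m = len(responses[0])
    return [[responses[j][i] / impulse_heights[j] for j in range(n)]
            for i in range(m)]


# Hypothetical toy set-up: n = 2 generated bins, m = 3 reconstructed bins.
heights = [100.0, 200.0]
responses = [[80.0, 20.0, 0.0],     # response to the impulse in bin 1
             [0.0, 50.0, 150.0]]    # response to the impulse in bin 2
P_impulse = identify_from_impulses(heights, responses)
```

With statistically fluctuating responses, every element of this matrix is noisy, which is one way to see why the resulting unfolded distribution fluctuates strongly.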

For system identification, instead of using impulse control distributions, 
we use a training sample of distributions based on a priori physically mo- 
tivated information that may be known from theory or from some other 
experimental data. 

Assume that we have a training sample with $q$ generated (input) distributions,
presented as a $q \times n$ matrix

$$\Phi = \begin{pmatrix}
\phi_{11} & \phi_{12} & \cdots & \phi_{1n} \\
\phi_{21} & \phi_{22} & \cdots & \phi_{2n} \\
\vdots & \vdots & & \vdots \\
\phi_{q1} & \phi_{q2} & \cdots & \phi_{qn}
\end{pmatrix},$$

where each row represents a generated histogram content. For each $i$th row
of the matrix P, we can write the following equation [14]:

$$f_i = \Phi p_i + \xi_i, \qquad (5)$$

where $p_i = (p_{i1}\, p_{i2} \cdots p_{in})'$, $f_i$ is a $q$-component vector of the content for the
reconstructed (output) $i$th bin for different generated distributions, and $\xi_i$
is a $q$-component vector of random residuals with expectation value $E\xi_i = 0$
and a diagonal variance matrix $\Gamma_i = \mathrm{Var}\,\xi_i = \mathrm{diag}(\gamma_{i1}^2, \ldots, \gamma_{iq}^2)$, where $\gamma_{ij}$
is the statistical error for the reconstructed distribution for the $i$th bin and
the $j$th generated distribution. Formally, a least squares method gives an
estimate for $p_i$, $i = 1, \ldots, m$:

$$\hat p_i = (\Phi'\Gamma_i^{-1}\Phi)^{-1}\Phi'\Gamma_i^{-1}f_i. \qquad (6)$$

The whole matrix P is found by performing the calculations defined by formula
(6) for all rows.
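
The row-by-row estimate can be sketched as a weighted least squares fit of each reconstructed bin against the generated histograms of the training sample. The code below is a minimal illustration in plain Python, not the author's implementation; the names `solve`, `estimate_row` and `estimate_matrix` and the toy training sample are assumptions introduced here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def estimate_row(Phi, f_i, gamma_i):
    """Weighted least squares estimate of one row of P from q training inputs."""
    n = len(Phi[0])
    w = [1.0 / g ** 2 for g in gamma_i]
    # Normal equations: (Phi' W Phi) p = Phi' W f_i with W = diag(1 / gamma^2).
    lhs = [[sum(wk * row[a] * row[b] for wk, row in zip(w, Phi)) for b in range(n)]
           for a in range(n)]
    rhs = [sum(wk * row[a] * fk for wk, row, fk in zip(w, Phi, f_i)) for a in range(n)]
    return solve(lhs, rhs)


def estimate_matrix(Phi, F, Gamma):
    """F[i] and Gamma[i] hold the outputs and errors of bin i over the q inputs."""
    return [estimate_row(Phi, f_i, g_i) for f_i, g_i in zip(F, Gamma)]


# Hypothetical training sample: q = 4 generated histograms with n = 2 bins each.
Phi = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
F = [[0.5, 0.25, 0.75, 1.25]]          # one reconstructed bin, noise-free
Gamma = [[1.0, 1.0, 1.0, 1.0]]
P_hat = estimate_matrix(Phi, F, Gamma)
```

In this noise-free toy case the fit recovers the row (0.5, 0.25) that was used to build `F`.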

Similarity in the shapes of the distributions of the training sample leads to high
correlations between the columns of the matrix $\Phi$. This means that the transformation
of a generated distribution to the ith bin of the reconstructed distribution can
be parameterized using a subset of the elements of row $p_i$. Elements of the row
that do not belong to the subset are set to 0.

A training sample containing copies of the same distribution is an example
of the singular case of similarity. For this example the transformation can be
reduced to only one non-zero element of the vector $p_i$.

Another example is a training sample that contains all possible distributions.
The number of non-zero elements cannot then be reduced, and the matrix P
coincides with the matrix calculated using impulse control signals.



A forward stepwise regression algorithm can be used [12] to find the non-zero
elements of a row $p_i$. The stepwise algorithm combines forward selection (FS)
and backward elimination (BE) steps. The steps follow each other and are
repeated until the process terminates. The steps are defined as follows.

Step FS. Suppose that $l$ elements of row $i$ are included in the transformation
model. The subvector of elements $\hat p_i(l)$ is calculated according to formula (6)
with the submatrices $\Phi(l)$ and $\Gamma_i(l)$ that correspond to this subvector.
A new element is added if

$$\frac{X_l^2 - X_{l+1}^2}{X_{l+1}^2}\,(n - l - 1) > F_{in}, \qquad (7)$$

where

$$X_l^2 = (f_i - \Phi(l)\hat p_i(l))'\,\Gamma_i^{-1}(l)\,(f_i - \Phi(l)\hat p_i(l)) \qquad (8)$$

and $F_{in}$ is a constant (threshold).

Step BE. Suppose that $l$ elements of row $i$ are included in the transformation
model. An element is excluded from the model if

$$\frac{X_{l-1}^2 - X_l^2}{X_l^2}\,(n - l) < F_{out}, \qquad (9)$$

where $F_{out}$ is another constant.

The algorithm terminates when no element can be found
that satisfies inequality (7) or inequality (9). Good results are obtained with the thresholds
$F_{in} = F_{out} = 3.29$, which have some theoretical background; see [13]. In our case the position
$k$ of the first element $p_{ik}$ is defined by the maximum value of the
correlation between the vector $f_i$ and the columns of the matrix $\Phi$:

$$\mathrm{Cor}(f_i, \phi^k) = \max[\mathrm{Cor}(f_i, \phi^1), \mathrm{Cor}(f_i, \phi^2), \ldots, \mathrm{Cor}(f_i, \phi^n)], \qquad (10)$$

where $\phi^k = (\phi_{1k}\, \phi_{2k} \cdots \phi_{qk})'$ is the $k$th column of $\Phi$.

The whole matrix P is found by applying the stepwise algorithm to all
rows.
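
The choice of the starting element by maximum correlation can be sketched as follows. This is a minimal illustration only: the helper names `correlation` and `first_element` are hypothetical, the Pearson correlation coefficient is assumed as the meaning of Cor, and the forward and backward steps themselves are not implemented here.

```python
from math import sqrt


def correlation(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    mu = sum(u) / len(u)
    mv = sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    if su == 0.0 or sv == 0.0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / (su * sv)


def first_element(Phi, f_i):
    """Index k of the column of Phi with the largest correlation with f_i."""
    columns = list(zip(*Phi))
    return max(range(len(columns)), key=lambda k: correlation(f_i, columns[k]))


# Hypothetical training sample: q = 4 generated histograms, n = 2 bins.
Phi = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [4.0, 1.0]]
f_i = [2.0, 4.0, 6.0, 8.0]   # output bin that follows the first generated bin
k_start = first_element(Phi, f_i)
```

Indices are zero-based in the code, so `k_start` refers to column `k_start + 1` in the notation of the text.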

It is possible that for each row there exists more than one subset of non-zero
matrix elements that describes the transformation sufficiently well.
This can happen, for example, when all distributions of the training sample
are rather close to each other. Thus, for each ith reconstructed bin we
will have a set of $N_i$ candidate rows, and for all reconstructed bins a set of
$N_1 \times N_2 \times \cdots \times N_m$ candidate matrices P. We need to choose a matrix P
that is good or optimal in some sense. The most convenient criterion in our
case is D-optimality [16], which is related to minimization of

$$\det\,(P'\Sigma^{-1}P)^{-1} = \det(\mathrm{Var}(\hat\phi)). \qquad (11)$$

There are many algorithms and programs for minimization of (11). The
matrix P that minimizes function (11) gives a stable solution to unfolding
problem (2) with a minimum volume for the confidence ellipsoid.
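
To make the D-optimality criterion concrete, the sketch below picks, from an explicit list of candidate matrices, the one with the largest determinant of the information matrix, which is equivalent to the smallest determinant of the variance matrix. It is a brute-force illustration under stated assumptions: the names `determinant`, `information_det` and `most_d_optimal` are hypothetical, and an exhaustive scan replaces the dedicated optimization algorithms mentioned in the text.

```python
def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in a]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            a[r] = [v - factor * w for v, w in zip(a[r], a[c])]
    return det


def information_det(P, sigma):
    """det(P' Sigma^-1 P) for a diagonal Sigma = diag(sigma^2)."""
    n = len(P[0])
    info = [[sum(row[a] * row[b] / s ** 2 for row, s in zip(P, sigma))
             for b in range(n)] for a in range(n)]
    return determinant(info)


def most_d_optimal(candidates, sigma):
    """Minimizing det of the variance matrix equals maximizing det of its inverse."""
    return max(candidates, key=lambda P: information_det(P, sigma))


# Two hypothetical candidate matrices for m = 3 measured and n = 2 true bins.
P_a = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
P_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sigma = [1.0, 1.0, 1.0]
best = most_d_optimal([P_a, P_b], sigma)
```

A scan of this kind is only feasible for a handful of candidates, since the number of candidate matrices grows as the product of the numbers of candidate rows.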

There are three possibilities to further improve the quality of the solution:

1. Introduce selection criteria for models of distributions used to create
the training sample. The previously described goodness-of-fit test can
be used for this purpose [17].

2. Each training distribution has a reconstructed distribution that can
be compared with the experimentally measured distribution using a
$\chi^2$ test [18]. Improvement is achieved by selecting distributions for
the training sample that satisfy $X^2 < a$, where $X^2$ is the test statistic
for comparison of the reconstructed and experimentally measured
distributions 10|. The threshold $a$ defines how close a reconstructed
distribution is to the experimental distribution. Note that any threshold
$a$ corresponds to a particular significance level for the test. It is
reasonable to expect that decreasing the parameter $a$ decreases the bias
and statistical error of the solution.

3. A leave-one-out validation procedure [19], [20] for q runs can be performed.
During a run the unfolding procedure is applied for each of q
reconstructed distributions. Each unfolded distribution is then compared
with the corresponding generated distribution using a $\chi^2$ test.
A boosting procedure [19], [20] can be used for distributions of
the training sample with a low p-value. This involves adding to the
training sample the same distribution with a statistically independent
realization of the corresponding reconstructed histogram.
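
The second possibility, selecting training distributions whose reconstructed histograms are close to the measured one, can be sketched as a simple filter. The statistic used below is a plain sum of squared differences divided by the summed variances; this is an assumption made for illustration and is not necessarily the test of the cited reference. The names `chi2_statistic` and `select_training` and the dictionary keys are hypothetical.

```python
def chi2_statistic(reconstructed, measured, err_rec, err_meas):
    """Simple chi-square-like distance between two histograms with known errors."""
    return sum((r - f) ** 2 / (er ** 2 + ef ** 2)
               for r, f, er, ef in zip(reconstructed, measured, err_rec, err_meas))


def select_training(sample, measured, err_meas, a):
    """Keep training entries whose reconstructed histogram satisfies X^2 < a.

    Each entry is a dict with keys 'generated', 'reconstructed' and 'errors'.
    """
    return [entry for entry in sample
            if chi2_statistic(entry["reconstructed"], measured,
                              entry["errors"], err_meas) < a]


# Hypothetical two-bin example with one close and one distant training entry.
measured = [100.0, 200.0]
err_meas = [10.0, 14.0]
sample = [
    {"generated": [120.0, 180.0], "reconstructed": [105.0, 195.0], "errors": [10.0, 14.0]},
    {"generated": [300.0, 50.0], "reconstructed": [250.0, 60.0], "errors": [16.0, 8.0]},
]
selected = select_training(sample, measured, err_meas, a=5.0)
```

Lowering the hypothetical threshold `a` keeps fewer training distributions, so in practice it has to be balanced against the number of distributions needed for the least squares fit.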



4. Unfolding procedure 

This section provides a description of the complete unfolding procedure. 
The procedure can be divided into four parts: initialization, system identifi- 
cation, solution of the basic equation, and validation. 



Initialization 

• Define the binning for the experimental (reconstructed) data. The strat- 
egy for selecting the bin size involves starting with a large bin size and 
then increasing the number of bins incrementally until the error for the 
unfolded distribution stops decreasing. 

• Define the binning for the unfolded (generated) distribution. The bin
size should be chosen by picking a reasonably large size first and then
decreasing the size in further steps until the correlation between adjacent
bins becomes too large. The number of bins for an unfolded
distribution, n, must be less than the number of bins for the experimentally
measured distribution, m, because the least squares method
is used to solve the main equation.

System identification



• Choose a training sample of generated distributions. Generated distributions
for the training sample must be chosen as described in the
previous section. A second iteration can be made to find a better set of
distributions. The number of generated distributions must be greater
than the expected number of non-zero elements in any row of matrix
P (for reasons related to use of the least squares method).

• Calculate the matrix P. The matrix is calculated according to the 
algorithm described in the previous section. 

• Calculate the D-optimal matrix P. Optimization can be performed using
Fedorov's reliable EA algorithm [16] with the initial matrix P calculated
in the previous step.

Solution of the basic equation 

• Calculate the unfolded distribution, Eq. (2), with the variance matrix,
Eq. (3). The correlation matrix calculated from the variance matrix
can give hints for improved binning of the unfolded distribution. For
example, if the correlation between two adjacent bins is high, they
should be combined.
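
The binning hint above can be sketched as follows: convert the variance matrix to a correlation matrix and list the adjacent bin pairs whose correlation exceeds a chosen limit. The function names and the threshold value 0.8 are assumptions for illustration; the text does not specify what counts as a high correlation.

```python
from math import sqrt


def correlation_matrix(cov):
    """Convert a variance (covariance) matrix into a correlation matrix."""
    n = len(cov)
    d = [sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]


def bins_to_merge(cov, threshold=0.8):
    """Return zero-based pairs (i, i + 1) of adjacent bins with |correlation| > threshold."""
    corr = correlation_matrix(cov)
    return [(i, i + 1) for i in range(len(cov) - 1) if abs(corr[i][i + 1]) > threshold]


# Hypothetical variance matrix for three unfolded bins.
cov = [[4.0, -3.8, 0.0],
       [-3.8, 4.0, 0.4],
       [0.0, 0.4, 1.0]]
pairs = bins_to_merge(cov)
```

Here the first two bins have a correlation of -0.95 and would be candidates for merging, while the second and third bins (correlation 0.2) would be left as they are.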

Validation of the unfolding procedure 

• Fit the unfolded distribution, and then use this fit to generate a recon- 
structed distribution (including the effects of resolution and acceptance) 
to compare with the real data. 



• Leave-one-out procedure [19], [20] for q runs. During a run, the unfolding
procedure is applied for each of q reconstructed distributions. The
unfolded distributions are then compared with the corresponding generated
distributions 12



This procedure yields an unfolded distribution with minimal statistical
errors and minimal bias for true distributions close to the distributions of
the training sample. This follows from the properties of the least mean square
method and the method used to calculate the transformation matrix P.

5. A numerical example

The method described above is now illustrated with an example proposed
by Blobel ^ and used for illustration elsewhere jsi 0| . We take a true distribution

$$\phi(x) = \sum_{k=1}^{3} A_k \frac{C_k^2}{(x - B_k)^2 + C_k^2} \qquad (12)$$

with the same parameters as in a previous study (Table 1, first row),
where x is defined on the interval [0, 2].



Table 1: Values of the parameters and intervals used for training sample simulations

            A1        A2       A3      B1          B2          B3        C1          C2          C3
Value       1         10       5       0.4         0.8         1.5       2           0.2         0.2
Interval    [0.5,3]   [6,14]   [1,9]   [0.2,0.6]   [0.6,1.3]   [1.3,2]   [0.5,3.5]   [0.1,0.4]   [0.1,0.5]




An experimentally measured distribution is defined as

$$f(x) = \int_0^2 \phi(x')A(x')R(x, x')\,dx', \qquad (13)$$

where the acceptance function $A(x)$ is

$$A(x) = 1 - \frac{(x - 1)^2}{2} \qquad (14)$$

and

$$R(x, x') = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x - x' + 0.05x'^2)^2}{2\sigma^2}\right) \qquad (15)$$

is the detector resolution function with $\sigma = 0.1$. The acceptance and resolution
functions are shown in Fig. 1.
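
The ingredients of this example can be sketched in code. The block below assumes that the true distribution is a sum of three Breit-Wigner-like terms with the first-row parameters of Table 1 and that the acceptance is 1 - (x - 1)^2 / 2, as in the commonly used form of this test problem; both forms are assumptions wherever the source equations are not legible. The function names and the rejection-sampling generator are likewise illustrative, and counting only accepted events toward the requested number is an assumption.

```python
import math
import random

TRUE_PARAMS = {"A": (1.0, 10.0, 5.0), "B": (0.4, 0.8, 1.5), "C": (2.0, 0.2, 0.2)}
SIGMA = 0.1  # width of the resolution function


def phi(x, params=TRUE_PARAMS):
    """Assumed true distribution: sum of three Breit-Wigner-like terms."""
    return sum(a * c * c / ((x - b) ** 2 + c * c)
               for a, b, c in zip(params["A"], params["B"], params["C"]))


def acceptance(x):
    """Assumed acceptance function on the interval [0, 2]."""
    return 1.0 - (x - 1.0) ** 2 / 2.0


def resolution(x, x_true, sigma=SIGMA):
    """Gaussian resolution with a shifted mean x_true - 0.05 * x_true^2."""
    shift = x - x_true + 0.05 * x_true ** 2
    return math.exp(-shift ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def simulate_measured(n_events, rng, params=TRUE_PARAMS):
    """Return n_events accepted, smeared x values generated by rejection sampling."""
    # Slightly inflated grid maximum used as the envelope of phi on [0, 2].
    phi_max = 1.01 * max(phi(i * 0.001, params) for i in range(2001))
    events = []
    while len(events) < n_events:
        x_true = rng.uniform(0.0, 2.0)
        if rng.random() * phi_max > phi(x_true, params):
            continue  # rejected by the shape of the true distribution
        if rng.random() > acceptance(x_true):
            continue  # lost because of limited acceptance
        events.append(rng.gauss(x_true - 0.05 * x_true ** 2, SIGMA))
    return events


events = simulate_measured(200, random.Random(1))
```

Filling `events` into a histogram gives one realization of a measured distribution; repeating the call with different parameter sets gives the reconstructed histograms of a training sample.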

Figure 1: The acceptance function $A(x)$ and resolution function $R(x, x')$ for $x'$ = 0.5, 1.0
and 1.5.

Figure 2: The measured distribution $f(x)$ (number of events divided by bin size). The
true distribution $\phi(x)$ is shown as a curve.

Figure 3: The histogram of the simulated true distribution. The true distribution $\phi(x)$ is
shown as a curve.

Figure 4: The first 50 distributions for the training sample.

Figure 5: The unfolded distribution $\hat\phi(x)$. The true distribution $\phi(x)$ is shown as a curve.

A histogram of the measured distribution $f$ was obtained by simulating
5000 events with m = 70 bins, as shown in Fig. 2.

For the true distribution histogram, we chose 12 bins of the same size as
in a previous study jH . Fig. 3 shows the histogram of the simulated true
distribution. For detector identification we used a training sample comprising
100 distributions defined by formula (12) with parameters simulated according
to uniform distributions on the intervals represented in Table 1. Fig. 4
shows 50 of the 100 true distributions used for identification. Histograms of
the measured distribution were obtained by generating 5000 events.
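
Drawing the parameter sets of the training sample from uniform distributions on the Table 1 intervals can be sketched as below. The dictionary layout, the parameter names and the function name `draw_training_parameters` are hypothetical; only the interval bounds are taken from Table 1.

```python
import random

# Intervals from Table 1; the parameter names follow the table header.
INTERVALS = {
    "A1": (0.5, 3.0), "A2": (6.0, 14.0), "A3": (1.0, 9.0),
    "B1": (0.2, 0.6), "B2": (0.6, 1.3), "B3": (1.3, 2.0),
    "C1": (0.5, 3.5), "C2": (0.1, 0.4), "C3": (0.1, 0.5),
}


def draw_training_parameters(n_sets, rng):
    """Return n_sets parameter dictionaries drawn uniformly on the intervals."""
    return [{name: rng.uniform(lo, hi) for name, (lo, hi) in INTERVALS.items()}
            for _ in range(n_sets)]


training_parameters = draw_training_parameters(100, random.Random(2024))
```

Each dictionary defines one generated distribution of the training sample; a fixed seed is used here only to make the illustration reproducible.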

Matrix calculation was performed without D-optimization. Table 2 shows
the position of non-zero elements of matrix P. Elements of the matrix that 
are not close to elements defined by the greatest correlation are rather small. 
The maximum number of elements in each row of matrix P that essentially 
defines the transformation is three. The matrix has approximately 20% non- 
zero elements. 



Table 2: Matrix P', where (•) denotes non-zero elements

•••••ooooooooooooooo»ooooooooo»ooooooooooooooo»ooooooooooooooooooooooo 
•oooo««»ooooo»ooooooooooooooooooooooooooooooooooooooooooooo»oooooooooo 
ooo««ooo»»»«»«»»oo»oooo»»ooooooo»oooooooooooooooooo»ooooo»oooooooooooo 
oooooooooooo««««»»»«»«»ooooo»oooooooooooooooooo»oooooooooooooooooooooo 
ooooooo»o»ooooo«oo»«««»»«»«»»«««»o»ooooooooooooooooo»oooooo»ooooooo»oo 
ooo««»oooooooooooooooooo«»«»»«««»»»«»»oooooooooo«ooooooooooooooooooooo 
ooooooo»oooooooooooooooooooooo««»»»«»»»««»»o«ooooooooooooooooooooo»ooo 
ooooooooooooooooooooooooooooo»oooo««»»»«»»»»««»»ooooo««oooooooooo»oooo 
oooooooooooooooooooooooooooooooooo«oooooo»»o««»»«««»»ooooooooooooooooo 
•ooooooooooooooooooooooooo»ooooooo«ooo»»ooo»o«»»o»»»»««»»»«»»ooooooooo 
oooooooooooooooooooooooooooooooooooooooooooooooooooo»o»»»»»»»»»»»»»ooo 
ooo»oooooooo»oo»oooooooooo»oo»oo»oooooooooooooooooooooo«oo«»«»»»«»»»»» 



Fig. 5 shows the unfolded distribution and the true distribution as a solid
line. Comparison shows that the unfolded distribution basically reflects the
fluctuations of the true distribution (see Fig. 3), but the statistical errors are
greater. Table 3 presents errors and the correlation matrix for the unfolded
distribution components. Errors are denoted as $\hat\delta_{ii}$ because they are only
estimates of the error $\delta_{ii}$.



Table 3: Errors $\hat\delta_{ii}$ and correlation matrix for the unfolded distribution $\hat\phi(x)$

  i  $\hat\delta_{ii}$     1     2     3     4     5     6     7     8     9    10    11    12
  1                 83         0.3   0.0   0.0   0.0   0.0  -0.1   0.0   0.0   0.1   0.0   0.0
  2                140   0.3         0.1   0.0   0.0   0.1  -0.1   0.0   0.0   0.0   0.0   0.0
  3                110   0.0   0.1        -0.1   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
  4                190   0.0   0.0  -0.1        -0.3   0.1  -0.1   0.0   0.0   0.0   0.0   0.0
  5                270   0.0   0.0   0.0  -0.3        -0.5   0.3  -0.1   0.1   0.0   0.0   0.0
  6                320   0.0   0.1   0.0   0.1  -0.5        -0.6   0.3  -0.1   0.0   0.0   0.0
  7                300  -0.1  -0.1   0.0  -0.1   0.3  -0.6        -0.5   0.0   0.1   0.0   0.0
  8                210   0.0   0.0   0.0   0.0  -0.1   0.3  -0.5        -0.2  -0.1   0.1   0.0
  9                200   0.0   0.0   0.0   0.0   0.1  -0.1   0.0  -0.2        -0.4   0.1   0.0
 10                210   0.1   0.0   0.0   0.0   0.0   0.0   0.1  -0.1  -0.4        -0.4   0.2
 11                160   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.1   0.1  -0.4        -0.3
 12                140   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.2  -0.3



To investigate the statistical properties of the unfolding procedure, 1000
simulation runs were performed to produce 1000 statistically independent
measured histograms for the same true distribution (12). The unfolded
distribution was calculated for each measured distribution. The same matrix P
was used for all cases. The following quantities were calculated:

• Exact value of the components of the true distribution
$\phi_i = 5000 \int_{x_i}^{x_{i+1}} \phi(x)\,dx / (x_{i+1} - x_i)$, where $x_{i+1}$ and $x_i$ are the bounds of the
ith bin.

• Average value of the components of the unfolded distribution
$\bar{\hat\phi}_i = \sum_{j=1}^{1000} \hat\phi_i(j)/1000$, where j is the run number.

• Bias for components of the unfolded distribution
$B\hat\phi_i = \bar{\hat\phi}_i - \phi_i$.

• Standard deviation $S_i$ for the unfolded distribution components
$S_i = \sqrt{\sum_{j=1}^{1000} (\hat\phi_i(j) - \bar{\hat\phi}_i)^2 / 999}$.

• Mean estimated error $\bar{\hat\delta}_{ii}$ for the unfolded distribution components
$\bar{\hat\delta}_{ii} = \sum_{j=1}^{1000} \hat\delta_{ii}(j)/1000$.

• Bias for errors in the unfolded distribution components
$B\hat\delta_{ii} = S_i - \bar{\hat\delta}_{ii}$.
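
The summary quantities listed above can be computed from the per-run results as in the following sketch. The function name `summarize`, the data layout (one list per run) and the three-run toy input are assumptions for illustration; the error bias is taken as the standard deviation minus the mean estimated error, which is the sign convention that reproduces the values in Table 4.

```python
from math import sqrt


def summarize(exact, unfolded_runs, error_runs):
    """Per-bin mean, bias, standard deviation, mean error and error bias.

    unfolded_runs[j][i] and error_runs[j][i] hold the unfolded value and its
    estimated error for run j and bin i.
    """
    runs = len(unfolded_runs)
    rows = []
    for i, phi_i in enumerate(exact):
        values = [run[i] for run in unfolded_runs]
        mean = sum(values) / runs
        # Sample standard deviation with (runs - 1) in the denominator.
        s_i = sqrt(sum((v - mean) ** 2 for v in values) / (runs - 1))
        mean_error = sum(run[i] for run in error_runs) / runs
        rows.append({"mean": mean, "bias": mean - phi_i, "S": s_i,
                     "mean_error": mean_error, "error_bias": s_i - mean_error})
    return rows


# Hypothetical single-bin example with three runs instead of 1000.
summary = summarize(exact=[10.0],
                    unfolded_runs=[[9.0], [11.0], [13.0]],
                    error_runs=[[2.0], [2.0], [2.0]])
```

An error bias near zero indicates that the errors estimated from the variance matrix describe the actual run-to-run scatter well.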



Table 4: Exact values of components $\phi_i$, average values of the unfolded distribution $\bar{\hat\phi}_i$,
bias $B\hat\phi_i$, standard deviation $S_i$, mean error $\bar{\hat\delta}_{ii}$ and bias for calculated errors $B\hat\delta_{ii}$

i   $\phi_i$  $\bar{\hat\phi}_i$  $B\hat\phi_i$  $S_i$  $\bar{\hat\delta}_{ii}$  $B\hat\delta_{ii}$
1   913       968                 55             82     84                       -2
2   1152      1213                62             125    133                      -8
3   1631      1666                36             117    116                      1
4   2760      2766                7              169    167                      2
5   4941      4897                -44            265    252                      12
6   5011      4957                -54            309    303                      7
7   3018      3070                53             292    298                      -6
8   2284      2379                95             173    177                      -4
9   2718      2770                53             205    199                      6
10  3073      2989                -83            210    213                      -3
11  1778      1776                -3             160    162                      -2
12  997       1037                40             129    138                      -9


The results presented in Table 4 and Fig. 6 show that the bias and 
statistical errors are small. Visual comparison of the unfolded distribution 
demonstrates the superiority of the present result over previous results 0]. 
Comparison of biases is not possible because biases have not been reported in
the literature on unfolding methods.

To demonstrate that the algorithm is robust, eight sets of parameters 
(Table 5) were randomly simulated according to uniform distributions on 
the intervals represented in Table 1. For each set, a random experiment with 
1000 runs was performed using matrix P defined in the first example. The 
results presented in Fig. 7 demonstrate the robustness of the method, with 
rather low bias for the unfolded distribution in all eight cases. 
Figure 6: Average values of components of the unfolded distribution $\bar{\hat\phi}_i$. Vertical bars
denote the mean error $\bar{\hat\delta}_{ii}$ and circle centers (○) denote the exact value of components $\phi_i$.
The true distribution $\phi(x)$ is shown as a curve.



Table 5: Values of the parameters chosen for numerical experiments

      A1     A2      A3     B1     B2     B3     C1     C2     C3
1     1.38   8.85    5.19   0.52   0.93   1.79   1.61   0.30   0.36
2     0.56   13.18   4.05   0.22   0.78   1.91   1.26   0.16   0.45
3     0.55   9.97    5.67   0.32   1.20   1.64   3.02   0.35   0.20
4     2.21   9.28    3.79   0.48   0.67   1.61   2.11   0.20   0.48
5     2.77   9.02    7.61   0.49   0.64   1.72   3.35   0.17   0.41
6     1.66   7.94    1.18   0.39   1.09   1.43   2.44   0.30   0.30
7     1.19   9.06    6.88   0.51   1.06   1.81   1.87   0.11   0.25
8     1.31   7.13    7.97   0.31   0.77   1.41   3.23   0.17   0.34



Figure 7: Results of numerical experiments for the eight distributions defined in Table 5,
from left to right and top to bottom. Graphs show average values of components of the
unfolded distributions $\bar{\hat\phi}_i$. Vertical bars denote the mean error $\bar{\hat\delta}_{ii}$, and circle centers (○)
denote the exact value of components $\phi_i$.



6. Discussion and conclusion

The main difficulties of the unfolding problem, which is a particular case
of the inverse problem, are widely known. Information is lost in measurement
owing to the inefficiency of registration in the frequency domain, caused by
the low-pass filter defined by the resolution function, and to the inefficiency
of event registration defined by the acceptance function of the set-up. Thus,
there are an infinite number of true distributions that give the same measured
distribution and therefore a priori information about the solution must be
used to solve an unfolding problem (inverse problem).

One way to solve an unfolding problem is to replace the original problem 
by a problem for a smoothed original true distribution and to use a sliding 
window (bin) for a smoothing. This is equivalent to solving the unfolding 
problem for the true distribution in some binning. Smoothing is low-pass 
filtering and the loss of information for a smoothed distribution due to the 
resolution function effect, which is another low-pass filter, is lower than for 
the original true distribution. Solution of the unfolding problem is easier, 
but no information is obtained about the structure of the original true dis- 
tribution inside the bin. 

In practical applications of the unfolding procedure, the transformation 
matrix P must be calculated. Simulation of the measurement process is used 
for this, especially in nuclear and particle physics. This process is very time- 
consuming and the sample size for simulated events is often of the same order 
as for measured events. The calculated matrix will have many noisy matrix 
elements in this case, which is another source of instability in solving the 
inverse problem. 

The main difficulties of the unfolding problem, formulated above at a physical
level of rigor, allow us to summarize the results of this paper and to place
the proposed unfolding method among other known methods.

The method presented here is a completely new approach to unfolding
problems using machine learning concepts, including a training sample, a
validation procedure and boosting. All a priori information about the solution
is contained in the training sample, which is a set of physically motivated
true distributions known from theory and other experiments. Methods for
selecting distributions for the training sample were presented in Section 3
and are supported by previous research [17].

In the proposed method, an unfolded distribution can be calculated for
a grid of points or for bins. There are no restrictions imposed by the
dimensionality of the problem or the configuration of the bins or the grid. The
method for identification provides a linear approximation of a transformation
from the true distribution to the measured distribution if this transformation
is non-linear.

The numerical example presented demonstrates the robustness of the new
unfolding procedure and the possibility of unfolding a whole set of distribu- 
tions with a single calculated matrix for transformation P. The set is defined 
as distributions used to create the training sample. Biases and statistical 
errors for components of the unfolded distribution were calculated using a 
Monte-Carlo method with 1000 runs. The examples demonstrate that the 
bias is small for components of the unfolded distribution and for estimates of 
the statistical errors. It should be noted that such biases were investigated 
for unfolding for the first time. The unfolding procedure is validated using 
a machine learning approach and has a good statistical interpretation. The 
proposed method has wide potential for applications in nuclear and particle 
physics, where models for training samples can be proposed and Monte-Carlo 
simulations can be used to calculate transformation matrices. 